Interaction between Record Matching and Data Repairing

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Interaction between Record Matching and Data Repairing

Central to a data cleaning system are record matching and data repairing. Matching aims to identify tuples that refer to the same real-world object, and repairing is to make a database consistent by fixing errors in the data by using integrity constraints. These are typically treated as separate processes in current data cleaning systems, based on heuristic solutions. This paper studies a new p...

متن کامل

Adaptive Approximate Record Matching

Typographical data entry errors and incomplete documents, produce imperfect records in real world databases. These errors generate distinct records which belong to the same entity. The aim of Approximate Record Matching is to find multiple records which belong to an entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...

متن کامل

Record Matching to Improve Data Quality

Data Quality is defined in [TB9SJ as fitness for use, which implies that quality is relative to the use of data. Problems with data quality tend to fall into two categories: inconsistency among systems and inconsistency with reality. Format/syntax, semantic and value inconsistencies are representative of inconsistency among systems whereas incorrect and missing values are representative of inco...

متن کامل

adaptive approximate record matching

typographical data entry errors and incomplete documents, produce imperfect records in real world databases. these errors generate distinct records which belong to the same entity. the aim of approximate record matching is to find multiple records which belong to an entity. in this paper, an algorithm for approximate record matching is proposed that can be adapted automatically with input error...

متن کامل

Unsupervised record matching with noisy and incomplete data

We consider the problem of duplicate detection: given a large data set in which each entry has multiple attributes, detect which distinct entries refer to the same real world entity. Our method consists of three main steps: creating a similarity score between entries, grouping entries together into ‘unique entities’, and refining the groups. We compare various methods for creating similarity sc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Journal of Data and Information Quality

سال: 2014

ISSN: 1936-1955,1936-1963

DOI: 10.1145/2567657